Resampling Technique for Imbalanced Class Handling on Educational Dataset
Authors
Abstract
Educational data mining is an emerging field within data mining. Accurately identifying student accomplishment in a current or upcoming course can help an institution build better technology-aided education. The field is becoming more important to study because of its potential to produce knowledge-based models that can help teachers and lecturers. Like other classification tasks, educational data mining has a common and frequently encountered problem: class imbalance, a condition in which the classes are not represented in equal proportion. In this research, the dataset was found to be severely imbalanced and multiclass, consisting of more than two class labels. This paper therefore focuses on handling the imbalanced classes with several classification methods (Linear Regression, Random Forest, and Stacking) combined with SMOTE, ADASYN, and SMOTE-ENN as resampling algorithms. The methods are evaluated using 10-fold cross-validation and an 80-20 splitting ratio. The results show that the best performance comes from ADASYN-resampled data under the 80-20 splitting ratio, with an F1 score of 0.97. The study also shows that the resampling technique improves classification performance. Even so, the data without resampling also produced decent performance, because its general pattern was already good from the start. Thus, there are no real drawbacks if the original data is processed.
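To make the resampling idea concrete, the following is a minimal sketch of SMOTE-style oversampling for a multiclass dataset, written in plain Python. It is not the authors' implementation: the function name `smote_oversample`, the toy feature values, and the class labels "pass", "fail", and "withdrawn" are all hypothetical. It assumes numeric features and raises every minority class to the size of the largest class by interpolating between a sample and one of its nearest same-class neighbours. In practice a library such as imbalanced-learn, which provides SMOTE, ADASYN, and SMOTE-ENN, would normally be used instead.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Raise every class to the majority count using SMOTE-style interpolation.

    Hypothetical helper for illustration only; X is a list of numeric
    feature lists and y is the matching list of class labels.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue  # majority class, or too few points to interpolate
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours of the chosen point (itself excluded)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from base to other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy, made-up data: three classes with 6, 3, and 2 samples.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.1],
     [2.0, 2.0], [2.2, 2.1], [2.1, 2.3],
     [5.0, 0.0], [5.2, 0.1]]
y = ["pass"] * 6 + ["fail"] * 3 + ["withdrawn"] * 2

X_res, y_res = smote_oversample(X, y, k=2)
print(Counter(y_res))
```

After resampling, each of the three toy classes holds six samples. When a hold-out split or cross-validation is used, resampling is usually applied to the training portion only, so that synthetic points do not leak into the evaluation data.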
Similar resources
Resampling Imbalanced Class and the Effectiveness of Feature Selection Methods for Heart Failure Dataset
Clinical datasets commonly have an imbalanced class distribution and high dimensional variables. Imbalanced class means that one class is represented by a large number (majority) of samples more than another (minority) one in binary classification [1]. For example, in our research dataset there are 1459 instances classified as “Alive” while 485 are classified as “Dead”. Machine learning is gene...
Class-Boundary Alignment for Imbalanced Dataset Learning
In this paper, we propose the class-boundary-alignment algorithm to augment SVMs to deal with imbalanced training-data problems posed by many emerging applications (e.g., image retrieval, video surveillance, and gene profiling). Through a simple example, we first show that SVMs can be ineffective in determining the class boundary when the training instances of the target class are heavily outnum...
Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem
The class imbalanced problem occurs in various disciplines when one of the target classes has a tiny number of instances compared to the other classes. A typical classifier normally ignores or neglects to detect a minority class due to the small number of class instances. SMOTE is one of the over-sampling techniques that remedy this situation. It generates minority instances within the overlapping regio...
Traffic Sign Recognition System for Imbalanced Dataset
In classification problems, the most important factor is the training dataset, which affects the accuracy rate of classification. However, we encounter imbalanced datasets in real-world applications. In such a dataset, the number of images in some classes is considerably smaller than the number of images in other classes. So the classification estimate tends toward the majority class, and minority classes will be ...
Journal
Journal title: Jurnal Informatika: Juita
Year: 2023
ISSN: 2579-8901, 2086-9398
DOI: https://doi.org/10.30595/juita.v11i1.15498